Non-Negative Matrix Factorization with Constraints
Authors
Abstract
Non-negative matrix factorization (NMF), a useful decomposition method for multivariate data, has been widely used in pattern recognition, information retrieval and computer vision. NMF is an effective algorithm for finding the latent structure of the data and leads to a parts-based representation. However, NMF is essentially an unsupervised method and cannot make use of label information. In this paper, we propose a novel semi-supervised matrix decomposition method, called Constrained Non-negative Matrix Factorization, which takes the label information as additional constraints. Specifically, we require that data points sharing the same label have the same coordinates in the new representation space. In this way, the learned representations can have more discriminating power. We demonstrate the effectiveness of this novel algorithm through a set of evaluations on real-world applications.
Introduction
Dimensionality reduction techniques have been receiving increasing attention as fundamental tools for data representation (Lee and Seung 1999; He, Cai, and Min 2005; Min, Lu, and He 2004; He et al. 2005). Among them, matrix decomposition approaches have been developed using different criteria. The most popular techniques include Principal Component Analysis (PCA), Singular Value Decomposition (SVD) and Vector Quantization. The central task of matrix factorization is to find two or more matrix factors whose product is a good approximation of the original matrix. In real applications, the dimension of the decomposed matrix factors is usually much smaller than that of the original matrix. This gives rise to compact representations of the data points, which can facilitate other learning tasks such as clustering and classification.
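As a rough illustration of the factorization idea described above, specialized to the non-negative case, the following minimal sketch approximates a non-negative data matrix X by the product of two smaller non-negative factors. It assumes the squared Frobenius error as the objective and uses the standard multiplicative update rules; the function name `nmf`, its parameters, and the symbols U and V are illustrative choices, not notation taken from the text.

```python
import numpy as np

def nmf(X, k, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative matrix X (m x n) by U @ V.T.

    U is m x k and V is n x k, both kept non-negative.
    Sketch only: assumes the squared Frobenius error ||X - U V^T||^2.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Strictly positive random initialization (hypothetical choice).
    U = rng.random((m, k)) + eps
    V = rng.random((n, k)) + eps
    for _ in range(n_iter):
        # Multiplicative updates: each entry is scaled by a non-negative
        # ratio, so U and V never become negative.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V
```

With k much smaller than both m and n, each row of V serves as a compact k-dimensional representation of one data point (one column of X), which is the kind of low-dimensional encoding that downstream clustering or classification would consume.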
Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Among matrix factorization methods, Non-negative Matrix Factorization (NMF) (Lee and Seung 1999; Li and Ding 2006) is distinctive in that it enforces the constraint that the factor matrices must be non-negative, i.e., all elements must be equal to or greater than zero. This non-negativity constraint leads NMF to a parts-based representation of the object, in the sense that it allows only additive, not subtractive, combinations of the components. Therefore, it is an ideal dimensionality reduction algorithm for image processing, face recognition (Lee and Seung 1999; Li et al. 2001) and document clustering (Xu, Liu, and Gong 2003), where it is natural to consider the object as a combination of parts that form a whole. Related work within the scope of non-negative matrix factorization includes pLSA (Hofmann 2001) and co-clustering (Dhillon, Mallela, and Modha 2003).
NMF is an unsupervised learning algorithm. As such, it cannot exploit the limited knowledge from domain experts that is available in many real-world problems. However, many machine learning researchers have found that unlabeled data, when used in conjunction with a small amount of labeled data, can produce considerable improvement in learning accuracy (Chapelle, Schölkopf, and Zien 2006; He 2010). The cost associated with the labeling process may render a fully labeled training set infeasible, whereas acquisition of a small set of labeled data is relatively inexpensive. In such situations, semi-supervised learning can be of great practical value. Therefore, it would be of great benefit to extend NMF to the semi-supervised setting.
Recently, Cai et al. (2008; 2009) proposed a Graph regularized NMF (GNMF) approach to encode the geometrical information of the data space. GNMF constructs a nearest neighbor graph to model the local manifold structure.
When label information is available, it can be naturally incorporated into the graph structure. Specifically, if two data points share the same label, a large weight can be assigned to the edge connecting them. If two data points have different labels, the corresponding weight is set to 0. This gives rise to semi-supervised GNMF. The major disadvantage of this approach is that there is no theoretical guarantee that data points from the same class will be mapped together in the new representation space, and it remains unclear how to select the weights in a principled manner.
In this paper, we propose a novel matrix decomposition method, called Constrained Non-negative Matrix Factorization (CNMF), which takes the label information as additional hard constraints. The central idea of our approach is that the data points from the same class should be merged together in the new representation space. Thus, the obtained parts-based representation has labels consistent with the original data, and therefore can have more discriminating
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10)
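The hard-constraint idea described above, that data points from the same class are merged in the new representation space, can be illustrated with a small sketch. The excerpt does not spell out the construction, so the following is only one possible way to realize it, under stated assumptions: the labeled points are assumed to come first, and the representation is written as V = A Z, where A is a 0/1 label-constraint matrix and Z is an auxiliary matrix. The names `label_constraint_matrix`, A, and Z are hypothetical.

```python
import numpy as np

def label_constraint_matrix(labels, n_total):
    """Build a 0/1 matrix A so that V = A @ Z gives same-label points
    identical rows in V.

    Assumptions (not from the source text): the first len(labels) points
    are the labeled ones; the remaining points are unlabeled and each
    keeps its own free row of Z.
    """
    labels = np.asarray(labels)
    n_labeled = labels.size
    classes, class_idx = np.unique(labels, return_inverse=True)
    n_classes = classes.size
    n_unlabeled = n_total - n_labeled
    A = np.zeros((n_total, n_classes + n_unlabeled))
    # Labeled point i selects the single row of Z belonging to its class.
    A[np.arange(n_labeled), class_idx] = 1.0
    # Unlabeled points get an identity block: one private row of Z each.
    A[np.arange(n_labeled, n_total), n_classes + np.arange(n_unlabeled)] = 1.0
    return A
```

For example, with `labels = [0, 1, 0]` and `n_total = 5`, A has shape 5 x 4, and for any Z with 4 rows the first and third rows of `A @ Z` coincide by construction. Unlike assigning graph weights, this makes the sharing of coordinates an exact property of the parameterization, with no weight to tune.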
Similar resources
Iterative Weighted Non-smooth Non-negative Matrix Factorization for Face Recognition
Non-negative Matrix Factorization (NMF) is a part-based image representation method. It comes from the intuitive idea that entire face image can be constructed by combining several parts. In this paper, we propose a framework for face recognition by finding localized, part-based representations, denoted “Iterative weighted non-smooth non-negative matrix factorization” (IWNS-NMF). A new cost fun...
A new approach for building recommender system using non negative matrix factorization method
Nonnegative Matrix Factorization is a new approach to reduce data dimensions. In this method, by applying the nonnegativity of the matrix data, the matrix is decomposed into components that are more interrelated and divide the data into sections where the data in these sections have a specific relationship. In this paper, we use the nonnegative matrix factorization to decompose the user ratin...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
On the stability of multiplicative update algorithms. Application to non-negative matrix factorization (French title: Sur la stabilité des règles de mises à jour multiplicatives. Application à la factorisation en matrices positives)
Multiplicative update algorithms have encountered a great success to solve optimization problems with nonnegativity constraints, such as the famous non-negative matrix factorization and its many variants. However, despite several years of research on the topic, the understanding of their convergence properties is still to be improved. In this paper, we show that Lyapunov’s stability theory prov...
Fast Non-negative Dimensionality Reduction for Protein Fold Recognition
In this paper, dimensionality reduction via matrix factorization with nonnegativity constraints is studied. Because of these constraints, it stands apart from other linear dimensionality reduction methods. Here we explore nonnegative matrix factorization in combination with a classifier for protein fold recognition. Since typically matrix factorization is iteratively done, convergence can be sl...
Matrix factorization with binary components
Motivated by an application in computational biology, we consider low-rank matrix factorization with {0, 1}-constraints on one of the factors and optionally convex constraints on the second one. In addition to the non-convexity shared with other matrix factorization schemes, our problem is further complicated by a combinatorial constraint set of size 2^{m·r}, where m is the dimension of the data p...